Show total amount of purchases in each month by each company

Here we create a stacked bar graph to visualize how many purchases occured by every month for each company. We can clearly the the numbers in white for each customer. The stacked bar graph is in order exactly like the legend so as you move down each square you would follow along with the legend.

Purchase_Data %>%
  ggplot(aes(x = format(Date, "%Y-%m"), fill = Company))+
  geom_bar(color = 'Black') +
  labs(x = 'Month', y = "Count of Purchases by Month", title = "How many Purchases by Companies every Month")+
  geom_text(stat = 'count', aes(label=..count..), size = 3, position = position_stack(vjust = 0.5), color="white")

Billing for each Company at the end of the month. This is the largest bill that Stamps will collect in the entire year

We want find the value of our bills we need to charge at the end of each month. The bill is calculated at the sum of the purchases a company creates minus the value of the payment they made in the same period. In our data payments are denoted as (-) so in our data we do an addition. Finally we want to see what the maximum amout of money we will collect in this year will be.

Bill_Data <- Purchase_Data %>%
  group_by(month, Company) %>%
  summarise(sum_spend = sum(Spend, na.rm = TRUE), sum_payment = sum(Payment, na.rm = TRUE), Bill = sum_spend + sum_payment)
max(Bill_Data$Bill) 
## [1] 9059216

What is the dollar amount of each customers highest bill

Now we want to see what is the dollar amount of the higest bill for each customer. This is the amount of money that each company owes.

Bill_Data %>%
  group_by(Company) %>%
  summarise(max_bill = max(Bill)) 
## # A tibble: 19 x 2
##    Company  max_bill
##    <chr>       <dbl>
##  1 Ashley    3797052
##  2 Daniel    5552764
##  3 David     3175100
##  4 Eddy      6547000
##  5 Frank     7165700
##  6 George    4702180
##  7 Helen     5999811
##  8 Jennifer  5535789
##  9 Karen     8128001
## 10 Lizzy      609204
## 11 Mary      7590764
## 12 Michael   4443700
## 13 Nancy     2562226
## 14 Patricia   700211
## 15 Scott     9059216
## 16 Sean      5700200
## 17 Susan     4427353
## 18 Tom       5870400
## 19 Walter    4119100

Profit margins is a flat percentage. 1st, 4th, 6th profitable person

Now we want to see which companies are the most profitable for us. So we look at the total amount of purchases that is made by each specific company. Then we order them from the largest total purchase to the smallest and return the values we are looking for: The most profitable companies.

Profitable_margin_customers <- Purchase_Data %>%
  group_by(Company) %>%
  summarise(total_purchase = sum(Spend, na.rm = TRUE)) %>%
  arrange(desc(total_purchase))
Profitable_margin_customers[1,]
## # A tibble: 1 x 2
##   Company total_purchase
##   <chr>            <dbl>
## 1 Scott         68201664
Profitable_margin_customers[4,]
## # A tibble: 1 x 2
##   Company total_purchase
##   <chr>            <dbl>
## 1 David         58145300
Profitable_margin_customers[6,]
## # A tibble: 1 x 2
##   Company total_purchase
##   <chr>            <dbl>
## 1 Mary          52393670

Credit Exposure

Since most companies are not paying their entire orders in one payment, but have spread them out over an amount of time. We need to determine what our exposure is, that is, how much would we be in debt if our customers suddenly went bankrupt. We can see that some months we are (-) which means that these months companies made enough payments to push us into cash flow positive territory.

Bill_Data %>%
  group_by(month) %>%
  summarise(credit_exposure = sum(Bill))
## # A tibble: 12 x 2
##    month credit_exposure
##    <chr>           <dbl>
##  1 01           53271317
##  2 02           32906230
##  3 03           24391771
##  4 04           20712338
##  5 05           23166734
##  6 06            3896696
##  7 07          - 3090934
##  8 08          -22527905
##  9 09            5832091
## 10 10           42366148
## 11 11          -47902177
## 12 12          - 3728272

Most Valueable Customer YTD. If we go off the same as in question 4 and choose 1, 4, 6

We have looked at the most valueable customer when it comes to the total amount of purchases made, but now we are looking for the most valueable customers that are making the most payments on their purchases.

Customer_Payments <- Purchase_Data %>%
  group_by(Company) %>%
  summarise(total_payments = sum(Payment, na.rm = TRUE)) %>%
  arrange(total_payments)
Customer_Payments[1,]
## # A tibble: 1 x 2
##   Company total_payments
##   <chr>            <dbl>
## 1 Scott        -63209095
Customer_Payments[4,]
## # A tibble: 1 x 2
##   Company total_payments
##   <chr>            <dbl>
## 1 Mary         -51800000
Customer_Payments[6,]
## # A tibble: 1 x 2
##   Company total_payments
##   <chr>            <dbl>
## 1 Sean         -49523100

The Highest Payments Per Month by each Company

We can break down our information of payments to look at who is making the highest payment by month.

Customer_Payments_month1 <- Purchase_Data %>%
  group_by(Company, month) %>%
  summarise(total_payments = sum(Payment, na.rm = TRUE)) %>%
  group_by(month) %>%
  summarise(total_payments = min(total_payments))
Customer_Payments_month2 <- Purchase_Data %>%
  group_by(Company, month) %>%
  summarise(total_payments = sum(Payment, na.rm = TRUE))
Customer_Payments_month <- left_join(Customer_Payments_month1, Customer_Payments_month2, by = c('month', 'total_payments'))

Customer_Payments_month
## # A tibble: 12 x 3
##    month total_payments Company 
##    <chr>          <dbl> <chr>   
##  1 01         - 4272000 Lizzy   
##  2 02         -10000000 Scott   
##  3 03         - 9045600 Frank   
##  4 04         - 9000000 Scott   
##  5 05         -10000000 Walter  
##  6 06         -12169922 Daniel  
##  7 07         -13400000 Jennifer
##  8 08         -13068510 George  
##  9 09         - 5108500 David   
## 10 10         - 6240000 Michael 
## 11 11         -45000000 Mary    
## 12 12         - 9923400 Eddy

Total Amount of Purchases by each Company for each Billing Cycle

Here we want to see what the total amount of purchases in dollar value is for each company in the billing cycle beginning on the 16th of every month and ending on the 15th.

Billing_Cycle <- left_join(January_Billing_Cycle, February_Billing_Cycle, by='Company') %>%
  left_join(., March_Billing_Cycle, by = "Company") %>%
  left_join(., April_Billing_Cycle, by='Company') %>%
  left_join(., May_Billing_Cycle, by = "Company") %>%
  left_join(., June_Billing_Cycle, by = 'Company') %>%
  left_join(., July_Billing_Cycle, by= 'Company') %>%
  left_join(., August_Billing_Cycle, by = "Company") %>%
  left_join(., September_Billing_Cycle, by = "Company") %>%
  left_join(., October_Billing_Cycle, by = "Company") %>%
  left_join(., November_Billing_Cycle, by = "Company")
Billing_Cycle
## # A tibble: 19 x 12
##    Company   January February   March   April     May    June    July
##    <chr>       <dbl>    <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
##  1 Ashley    3098183  5745574 3138154 4348063 4111703 2894911 1805020
##  2 Daniel    5456225  4291621 5720926 5752545 6390772 4940389 6886136
##  3 David     2628400  4991700 4945100 5000600 4431500 2214500 5753400
##  4 Eddy      4261600  1369600 5117200 2938300 6096200 6313500 2228200
##  5 Frank     4209800  3626900 9199100 4813100 1254100 2907300 5482600
##  6 George    4755520  1926960 2246930 2627170 5117280 3466190 3450860
##  7 Helen     3568047  5011199 3311189 3927736 3635539 4842149 3960818
##  8 Jennifer  2719514  1572964 2797997 3438740 3562300 4700405 6040675
##  9 Karen     2659484  6536110 6626620 3754103 3879444 1436659 3575490
## 10 Lizzy     5855749  5027807 3960768 1804324 3085114 1635378 1164010
## 11 Mary      3342360  5376971 6827838 4410938 3980587 3242041 4948087
## 12 Michael   2989700  6417600 3788500 2579400 3942500 5546300 3118400
## 13 Nancy     5617931  2673948 4897152 3668491 3387715 3482032 5612720
## 14 Patricia  6038053  2981111 3234581 3738710 2083005 1936890 5131643
## 15 Scott    12252657  3909827 8462021 4997324 4598806 5246486 3186489
## 16 Sean      6280400  3491600 4441500 5407500 4825400 3790000 2399400
## 17 Susan     3972893  2992308 2487797 4302904 1402766 4096346 2547483
## 18 Tom       4363200  5899200 2883800 4415500 4467100 3712900  635700
## 19 Walter    5893200  3083800 5308200 5162900 1810600 2496300 8944200
##     August September October November
##      <dbl>     <dbl>   <dbl>    <dbl>
##  1 2535477   1346772 2784222  7039906
##  2 3160526   2591146 5102280  3832227
##  3 7170400   4590400 6339000  4653100
##  4 5576500   4020200 4868600   671700
##  5 3600200   3745700 2895000  4855900
##  6 2092020   2261770 1616300  3822770
##  7 2213471   3778384 7238745  3211923
##  8 2138248   3935939 5368405  1634365
##  9 2435576   4453286 3445842  3363250
## 10 2602199   4014915 2463920  4102647
## 11 3038473   2587975 7442342  3927270
## 12 3153300   4720900 4178000  4313000
## 13 4176496   1760638 2159807  6102830
## 14 3977498   4717999 2740970  3272324
## 15 3649871   5208052 4077504  7006828
## 16 4497000   3857800 9199300  5509400
## 17 5343591   3589804 6727223  2599159
## 18 2182300   6500700 4291700  4677800
## 19 3967400   4853200 6625000  3386800

Visualizing our Data

Here we want to see how the purchases of all the companies look across the entire year of 2017. Plotting each respective company we can go ahead and smooth out the data to see if we can determine any form of trend by each company. This will be a start for our machine learning to come for the forecasting of the year of 2018 by each company. We can see some form of patterns in the data as some companies tend to decrease their orders while others have a strong consistency of orders.

Purchase_Data %>%
  ggplot(aes(x = Date, y = Spend, group = 1))+
  facet_wrap(~ Company, ncol = 2, scales = 'free')+
  geom_point(aes(color = day_of_week), na.rm = TRUE)+
  geom_line(na.rm = TRUE, show.legend = FALSE, color = palette_light()[[1]], alpha = 0.8)+
  geom_smooth(na.rm = TRUE, color = 'blue')+
  scale_color_manual(values = palette_light())+
  theme_tq()+
  labs(color = "Days of the Week", title = "Purchases From The Entire Year by each Company", x = 'Days')
## `geom_smooth()` using method = 'loess'

Forecasting for the Year of 2018

Here we are forecasting the pruchases of the entire year of 2018. We are not considering specific companies, we keep every else constant and only focus on the total sales. We can see that next year the purchases will start off strong with an increase in purchases and continue to increase until about half-way through the year. Then we can expect that purchases will begin to slow down into December but still the total amount of purchases will be at a much higher average than that of 2017. We can only make this slight prediction since our data set is only for a year so we cannot determine if there is a consistent patter, we would have to analyze more data in order to see if there is some sort of trend of every year there being a total increase in the whole amount of purchases, then proceed to slow down and consolidate.

purchase_spend %>%
  select(Date, total_spend) %>%
  rename(date = Date) %>%
  ggplot(aes(x = date, y = total_spend))+
  geom_point(alpha = .5)+
  geom_line(alpha = .5)+
  geom_ribbon(aes(ymin = lo.95, ymax = hi.95), data = pred_future,
              fill = '#D5DBFF', color = NA, size = 0)+
  geom_ribbon(aes(ymin = lo.80, ymax = hi.80, fill = key), data = pred_future,
              fill = "#596DD5", color = NA, size = 0, alpha =.8)+
  geom_point(aes(x = date, y = (total_spend)), data = pred_future,
             alpha = .5, color = palette_light()[2], na.rm = TRUE)+
  geom_smooth(aes(x = date, y = (total_spend)), data = pred_future,
              method = 'loess', color = 'white', na.rm = TRUE)+
  theme_tq()